25 research outputs found

    Multilevel Weighted Support Vector Machine for Classification on Healthcare Data with Missing Values

    This work is motivated by the needs of predictive analytics on healthcare data as represented by Electronic Medical Records. Such data is invariably problematic: noisy, with missing entries and with imbalance in the classes of interest, leading to serious bias in predictive modeling. Since standard data mining methods often produce poor performance measures, we argue for the development of specialized techniques of data preprocessing and classification. In this paper, we propose a new method to simultaneously classify large datasets and reduce the effects of missing values. It is based on a multilevel framework of the cost-sensitive SVM and the expected maximization imputation method for missing values, which relies on iterated regression analyses. We compare classification results of multilevel SVM-based algorithms on public benchmark datasets with imbalanced classes and missing values, as well as on real data in health applications, and show that our multilevel SVM-based method produces fast, more accurate, and more robust classification results. Comment: arXiv admin note: substantial text overlap with arXiv:1503.0625

    Cancer Incidence, Mortality, Years of Life Lost, Years Lived With Disability, and Disability-Adjusted Life Years for 29 Cancer Groups From 2010 to 2019: A Systematic Analysis for the Global Burden of Disease Study 2019.

    The Global Burden of Diseases, Injuries, and Risk Factors Study 2019 (GBD 2019) provided systematic estimates of incidence, morbidity, and mortality to inform local and international efforts toward reducing cancer burden. The objective of this analysis was to estimate cancer burden and trends globally for 204 countries and territories and by Sociodemographic Index (SDI) quintiles from 2010 to 2019. The GBD 2019 estimation methods were used to describe cancer incidence, mortality, years lived with disability, years of life lost, and disability-adjusted life years (DALYs) in 2019 and over the past decade. Estimates are also provided by quintiles of the SDI, a composite measure of educational attainment, income per capita, and total fertility rate for those younger than 25 years. Estimates include 95% uncertainty intervals (UIs). In 2019, there were an estimated 23.6 million (95% UI, 22.2-24.9 million) new cancer cases (17.2 million when excluding nonmelanoma skin cancer) and 10.0 million (95% UI, 9.36-10.6 million) cancer deaths globally, with an estimated 250 million (235-264 million) DALYs due to cancer. Relative to 2010, these represented a 26.3% (95% UI, 20.3%-32.3%) increase in new cases, a 20.9% (95% UI, 14.2%-27.6%) increase in deaths, and a 16.0% (95% UI, 9.3%-22.8%) increase in DALYs. Among 22 groups of diseases and injuries in the GBD 2019 study, cancer was second only to cardiovascular diseases for the number of deaths, years of life lost, and DALYs globally in 2019. Cancer burden differed across SDI quintiles. The proportion of years lived with disability that contributed to DALYs increased with SDI, ranging from 1.4% (1.1%-1.8%) in the low SDI quintile to 5.7% (4.2%-7.1%) in the high SDI quintile. While the high SDI quintile had the highest number of new cases in 2019, the middle SDI quintile had the highest number of cancer deaths and DALYs. From 2010 to 2019, the largest percentage increase in the numbers of cases and deaths occurred in the low and low-middle SDI quintiles.
The results of this systematic analysis suggest that the global burden of cancer is substantial and growing, with burden differing by SDI. These results provide comprehensive and comparable estimates that can potentially inform efforts toward equitable cancer control around the world.
Funding/Support: The Institute for Health Metrics and Evaluation received funding from the Bill & Melinda Gates Foundation and the American Lebanese Syrian Associated Charities. Dr Aljunid acknowledges the Department of Health Policy and Management of Kuwait University and the International Centre for Casemix and Clinical Coding, National University of Malaysia for the approval and support to participate in this research project. Dr Bhaskar acknowledges institutional support from the NSW Ministry of Health and NSW Health Pathology. Dr Bärnighausen was supported by the Alexander von Humboldt Foundation through the Alexander von Humboldt Professor award, which is funded by the German Federal Ministry of Education and Research. Dr Braithwaite acknowledges funding from the National Institutes of Health/National Cancer Institute. Dr Conde acknowledges financial support from the European Research Council ERC Starting Grant agreement No 848325. Dr Costa acknowledges her grant (SFRH/BHD/110001/2015), received by Portuguese national funds through Fundação para a Ciência e Tecnologia, IP under the Norma Transitória grant DL57/2016/CP1334/CT0006. Dr Ghith acknowledges support from a grant from Novo Nordisk Foundation (NNF16OC0021856). Dr Glasbey is supported by a National Institute of Health Research Doctoral Research Fellowship. Dr Vivek Kumar Gupta acknowledges funding support from National Health and Medical Research Council Australia. Dr Haque thanks Jazan University, Saudi Arabia for providing access to the Saudi Digital Library for this research study.
Drs Herteliu, Pana, and Ausloos are partially supported by a grant of the Romanian National Authority for Scientific Research and Innovation, CNDS-UEFISCDI, project number PN-III-P4-ID-PCCF-2016-0084. Dr Hugo received support from the Higher Education Improvement Coordination of the Brazilian Ministry of Education for a sabbatical period at the Institute for Health Metrics and Evaluation, between September 2019 and August 2020. Dr Sheikh Mohammed Shariful Islam acknowledges funding by a National Heart Foundation of Australia Fellowship and National Health and Medical Research Council Emerging Leadership Fellowship. Dr Jakovljevic acknowledges support through grant OI 175014 of the Ministry of Education Science and Technological Development of the Republic of Serbia. Dr Katikireddi acknowledges funding from a NHS Research Scotland Senior Clinical Fellowship (SCAF/15/02), the Medical Research Council (MC_UU_00022/2), and the Scottish Government Chief Scientist Office (SPHSU17). Dr Md Nuruzzaman Khan acknowledges the support of Jatiya Kabi Kazi Nazrul Islam University, Bangladesh. Dr Yun Jin Kim was supported by the Research Management Centre, Xiamen University Malaysia (XMUMRF/2020-C6/ITCM/0004). Dr Koulmane Laxminarayana acknowledges institutional support from Manipal Academy of Higher Education. Dr Landires is a member of the Sistema Nacional de Investigación, which is supported by Panama’s Secretaría Nacional de Ciencia, Tecnología e Innovación. Dr Loureiro was supported by national funds through Fundação para a Ciência e Tecnologia under the Scientific Employment Stimulus–Institutional Call (CEECINST/00049/2018). Dr Molokhia is supported by the National Institute for Health Research Biomedical Research Center at Guy’s and St Thomas’ National Health Service Foundation Trust and King’s College London. Dr Moosavi appreciates NIGEB's support. Dr Pati acknowledges support from the SIAN Institute, Association for Biodiversity Conservation & Research. 
Dr Rakovac acknowledges a grant from the government of the Russian Federation in the context of World Health Organization Noncommunicable Diseases Office. Dr Samy was supported by a fellowship from the Egyptian Fulbright Mission Program. Dr Sheikh acknowledges support from Health Data Research UK. Drs Adithi Shetty and Unnikrishnan acknowledge support given by Kasturba Medical College, Mangalore, Manipal Academy of Higher Education. Dr Pavanchand H. Shetty acknowledges Manipal Academy of Higher Education for their research support. Dr Diego Augusto Santos Silva was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil Finance Code 001 and is supported in part by CNPq (302028/2018-8). Dr Zhu acknowledges the Cancer Prevention and Research Institute of Texas grant RP210042

    Model Reduction for Simulation, Optimization and Control

    Many tasks of simulation, optimization and control can be performed more efficiently if the intermediate complexity of the numerical model is reduced. In our work, we investigate model reduction as applied to reaction-transport systems of atmospheric chemistry. We use a Proper Orthogonal Decomposition-based approach to extract information from a set of model observations and to project the model equations onto a reduced-order space chosen in such a way that the essential model behavior is preserved in the solution of the reduced version. We examine and improve many features of the method. In particular, we show how to measure sensitivities of the model reduction process and use the results to select the placement and weighting of observations to best reproduce specific events in the full model behavior; we also develop novel techniques that take multiple events into account. We show how to construct reduced models to replace the full model in iterative parameter optimization procedures so that fewer steps and a lower computational budget are needed. The result of the study is a more complete understanding of how to perform tasks of simulation and optimization of nonlinear models using model reduction tools.

    Learning of Highly-Filtered Data Manifold Using Spectral Methods

    We propose a scheme for improving existing tools for recovering and predicting decisions based on singular value decomposition. Our main contribution is an investigation of the advantages of using a functional, rather than linear, approximation of the response of an unknown, complicated model. A significant attractive feature of the method is the demonstrated ability to make predictions based on a highly filtered data set. An adaptive high-order interpolation is constructed that estimates the relative probability of each possible decision. The method uses a flexible nonlinear basis, capable of utilizing all the available information. We demonstrate that the prediction can be based on a very small fraction of the training set. The suggested approach is relevant in the general field of manifold learning, as a tool for approximating the response of models based on many parameters. Our experiments show that the approach is at least competitive with other latent factor prediction methods, and that the precision of prediction grows with the order of the polynomial basis.

    Computational time in seconds (not including the REM method).

    Healthcare datasets.

    The set “Example 1” has 10000 observations in each class. In the set “Example 2”, the majority and minority classes contain 50400 and 33600 observations, respectively. For details about the data, see [8] (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0155119#pone.0155119.ref008).